AITopics

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > Ohio (0.04)
(4 more...)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Luo, Zhankun, Hashemi, Abolfazl

Structural Properties, Cycloid Trajectories and Non-Asymptotic Guarantees of EM Algorithm for Mixed Linear Regression

arXiv.org Artificial IntelligenceNov-10-2025

This work investigates the structural properties, cycloid trajectories, and non-asymptotic convergence guarantees of the Expectation-Maximization (EM) algorithm for two-component Mixed Linear Regression (2MLR) with unknown mixing weights and regression parameters. Recent studies have established global convergence for 2MLR with known balanced weights and super-linear convergence in noiseless and high signal-to-noise ratio (SNR) regimes. However, the theoretical behavior of EM in the fully unknown setting remains unclear, with its trajectory and convergence order not yet fully characterized. We derive explicit EM update expressions for 2MLR with unknown mixing weights and regression parameters across all SNR regimes and analyze their structural properties and cycloid trajectories. In the noiseless case, we prove that the trajectory of the regression parameters in EM iterations traces a cycloid by establishing a recurrence relation for the sub-optimality angle, while in high SNR regimes we quantify its discrepancy from the cycloid trajectory. The trajectory-based analysis reveals the order of convergence: linear when the EM estimate is nearly orthogonal to the ground truth, and quadratic when the angle between the estimate and ground truth is small at the population level. Our analysis establishes non-asymptotic guarantees by sharpening bounds on statistical errors between finite-sample and population EM updates, relating EM's statistical accuracy to the sub-optimality angle, and proving convergence with arbitrary initialization at the finite-sample level. This work provides a novel trajectory-based framework for analyzing EM in Mixed Linear Regression.

artificial intelligence, log 1, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2511.04937

Country:

North America > United States (0.46)
Asia (0.45)
Europe > Austria (0.27)

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.90)

Kipnis, Alex, Binz, Marcel, Schulz, Eric

metabeta - A fast neural model for Bayesian mixed-effects regression

arXiv.org Machine LearningOct-10-2025

Hierarchical data with multiple observations per group is ubiquitous in empirical sciences and is often analyzed using mixed-effects regression. In such models, Bayesian inference gives an estimate of uncertainty but is analytically intractable and requires costly approximation using Markov Chain Monte Carlo (MCMC) methods. Neural posterior estimation shifts the bulk of computation from inference time to pre-training time, amortizing over simulated datasets with known ground truth targets. We propose metabeta, a transformer-based neural network model for Bayesian mixed-effects regression. Using simulated and real data, we show that it reaches stable and comparable performance to MCMC-based parameter estimation at a fraction of the usually required time.

dataset, fast neural model, posterior, (14 more...)

2510.07473

Country:

North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

Ben Lengerich, Bryon Aragam, Eric P. Xing

Learning Sample-Specific Models with Low-Rank Personalized Regression

Neural Information Processing SystemsOct-2-2025, 17:51:08 GMT

Modern applications of machine learning (ML) deal with increasingly heterogeneous datasets comprised of data collected from overlapping latent subpopulations.

artificial intelligence, machine learning, natural language, (21 more...)

Country: North America > United States > Pennsylvania (0.14)

Genre: Research Report > New Finding (0.69)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Government > Voting & Elections (1.00)
Banking & Finance (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Neural Information Processing SystemsOct-1-2025, 22:12:25 GMT

Coresets for Regressions with Panel Data

Such data track features of a cross-section of entities (e.g.,

artificial intelligence, coreset, machine learning, (17 more...)

Country: North America > United States (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Gi-Soo Kim, Myunghee Cho Paik

Doubly-Robust Lasso Bandit

Neural Information Processing SystemsAug-20-2025, 04:21:51 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, bastani and bayati, probability, (13 more...)

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Data Science (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Luo, Zhankun, Hashemi, Abolfazl

Characterizing Evolution in Expectation-Maximization Estimates for Overspecified Mixed Linear Regression

arXiv.org Artificial IntelligenceAug-15-2025

Mixture models have attracted significant attention due to practical effectiveness and comprehensive theoretical foundations. A persisting challenge is model misspecification, which occurs when the model to be fitted has more mixture components than those in the data distribution. In this paper, we develop a theoretical understanding of the Expectation-Maximization (EM) algorithm's behavior in the context of targeted model misspecification for overspecified two-component Mixed Linear Regression (2MLR) with unknown $d$-dimensional regression parameters and mixing weights. In Theorem 5.1 at the population level, with an unbalanced initial guess for mixing weights, we establish linear convergence of regression parameters in $O(\log(1/ε))$ steps. Conversely, with a balanced initial guess for mixing weights, we observe sublinear convergence in $O(ε^{-2})$ steps to achieve the $ε$-accuracy at Euclidean distance. In Theorem 6.1 at the finite-sample level, for mixtures with sufficiently unbalanced fixed mixing weights, we demonstrate a statistical accuracy of $O((d/n)^{1/2})$, whereas for those with sufficiently balanced fixed mixing weights, the accuracy is $O((d/n)^{1/4})$ given $n$ data samples. Furthermore, we underscore the connection between our population level and finite-sample level results: by setting the desired final accuracy $ε$ in Theorem 5.1 to match that in Theorem 6.1 at the finite-sample level, namely letting $ε= O((d/n)^{1/2})$ for sufficiently unbalanced fixed mixing weights and $ε= O((d/n)^{1/4})$ for sufficiently balanced fixed mixing weights, we intuitively derive iteration complexity bounds $O(\log (1/ε))=O(\log (n/d))$ and $O(ε^{-2})=O((n/d)^{1/2})$ at the finite-sample level for sufficiently unbalanced and balanced initial mixing weights. We further extend our analysis in overspecified setting to low SNR regime.

artificial intelligence, machine learning, tanh, (19 more...)

arXiv.org Artificial Intelligence

2508.10154

Country: North America > United States (0.45)

Genre: Research Report > New Finding (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Simkus, Vaidotas, Gutmann, Michael U.

CFMI: Flow Matching for Missing Data Imputation

arXiv.org Machine LearningJun-12-2025

We introduce conditional flow matching for imputation (CFMI), a new general-purpose method to impute missing data. The method combines continuous normalising flows, flow-matching, and shared conditional modelling to deal with intractabilities of traditional multiple imputation. Our comparison with nine classical and state-of-the-art imputation methods on 24 small to moderate-dimensional tabular data sets shows that CFMI matches or outperforms both traditional and modern techniques across a wide range of metrics. Applying the method to zero-shot imputation of time-series data, we find that it matches the accuracy of a related diffusion-based method while outperforming it in terms of computational efficiency. Overall, CFMI performs at least as well as traditional methods on lower-dimensional data while remaining scalable to high-dimensional settings, matching or exceeding the performance of other deep learning-based approaches, making it a go-to imputation method for a wide range of data types and dimensionalities.

data quality, machine learning, missingness, (15 more...)

2506.09258

Country:

North America > United States > California (0.05)
North America > United States > Wisconsin (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Luo, Zhankun, Hashemi, Abolfazl

Unveiling the Cycloid Trajectory of EM Iterations in Mixed Linear Regression

arXiv.org Machine LearningJun-3-2024

We study the trajectory of iterations and the convergence rates of the Expectation-Maximization (EM) algorithm for two-component Mixed Linear Regression (2MLR). The fundamental goal of MLR is to learn the regression models from unlabeled observations. The EM algorithm finds extensive applications in solving the mixture of linear regressions. Recent results have established the super-linear convergence of EM for 2MLR in the noiseless and high SNR settings under some assumptions and its global convergence rate with random initialization has been affirmed. However, the exponent of convergence has not been theoretically estimated and the geometric properties of the trajectory of EM iterations are not well-understood. In this paper, first, using Bessel functions we provide explicit closed-form expressions for the EM updates under all SNR regimes. Then, in the noiseless setting, we completely characterize the behavior of EM iterations by deriving a recurrence relation at the population level and notably show that all the iterations lie on a certain cycloid. Based on this new trajectory-based analysis, we exhibit the theoretical estimate for the exponent of super-linear convergence and further improve the statistical error bound at the finite-sample level. Our analysis provides a new framework for studying the behavior of EM for Mixed Linear Regression.

iteration, log 1, tanh, (15 more...)

2405.18237

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Pazira, Hassan, Massa, Emanuele, Weijers, Jetty AM, Coolen, Anthony CC, Jonker, Marianne A

Bayesian Federated Inference for Survival Models

arXiv.org Machine LearningApr-26-2024

In cancer research, overall survival and progression free survival are often analyzed with the Cox model. To estimate accurately the parameters in the model, sufficient data and, more importantly, sufficient events need to be observed. In practice, this is often a problem. Merging data sets from different medical centers may help, but this is not always possible due to strict privacy legislation and logistic difficulties. Recently, the Bayesian Federated Inference (BFI) strategy for generalized linear models was proposed. With this strategy the statistical analyses are performed in the local centers where the data were collected (or stored) and only the inference results are combined to a single estimated model; merging data is not necessary. The BFI methodology aims to compute from the separate inference results in the local centers what would have been obtained if the analysis had been based on the merged data sets. In this paper we generalize the BFI methodology as initially developed for generalized linear models to survival models. Simulation studies and real data analyses show excellent performance; i.e., the results obtained with the BFI methodology are very similar to the results obtained by analyzing the merged data. An R package for doing the analyses is available.

baseline hazard function, estimator, hazard function, (14 more...)